Abstract

We establish the result of the title. 

In combinatorial terms this has the implication that for sufficiently small $\epsilon > 0$, for all $n$, any
marking of an $\epsilon$ fraction of the vertices of the $n$-dimensional hypercube necessarily leaves a vertex
$x$ such that marked vertices are a minority of every sphere centered at $x$.

1 Introduction 

Let $I^n$ be the $n$-dimensional hypercube: the set $\{0,1\}^n$ equipped with Hamming metric $d(x,y) = |\{i : x_i \neq y_i\}|$. Let $V = \mathbb{R}^{I^n}$ be the vector space of real-valued functions on the hypercube. For $x \in I^n$, let
$\pi_x$ denote the evaluation map from $V \to \mathbb{R}$ defined by $\pi_x f = f(x)$, for $f \in V$. If $\mathcal{A} \subseteq \mathrm{Hom}(V, V)$,
the maximal operator $M_{\mathcal{A}} : V \to V$ is the sublinear operator in which $M_{\mathcal{A}} f$ is defined by

$$\pi_x M_{\mathcal{A}} f = \sup_{A \in \mathcal{A}} \pi_x A f \qquad (1)$$

Of interest is the family $\mathcal{S} = \{S_k\}_{k=0}^{n}$ of spherical means, the stochastic linear operators $S_k : V \to V$
given by

$$\pi_x S_k f = \sum_{\{y : d(x,y) = k\}} \pi_y f \Big/ \binom{n}{k}$$

Applying $M$, we have the spherical maximal operator $M_{\mathcal{S}} : V \to V$ defined by

$$\pi_x M_{\mathcal{S}} f = \max_{0 \le k \le n} \pi_x S_k f \qquad (2)$$

The result of this paper is the following dimension-free bound. 
Theorem 1. There is a constant $A_I$ such that for all $n$, $\|M_{\mathcal{S}}\|_{2\to 2} < A_I$.

Equivalently, for all $n$ and $f$, $\|M_{\mathcal{S}} f\|_2 < A_I \|f\|_2$.
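The definitions of $S_k$ and $M_{\mathcal{S}}$ can be made concrete on a small cube. The following is a minimal brute-force sketch, not part of the paper; the function names and the choice $n = 4$ are illustrative assumptions. It evaluates the spherical means and the spherical maximal function of a point mass.

```python
from itertools import product
from math import comb

def hamming(x, y):
    """Hamming distance between two 0/1 tuples."""
    return sum(a != b for a, b in zip(x, y))

def spherical_mean(f, n, k):
    """S_k f: average of f over the Hamming sphere of radius k around each vertex."""
    cube = list(product((0, 1), repeat=n))
    return {x: sum(f[y] for y in cube if hamming(x, y) == k) / comb(n, k)
            for x in cube}

def spherical_maximal(f, n):
    """M_S f: pointwise maximum of S_k f over all radii 0 <= k <= n."""
    means = [spherical_mean(f, n, k) for k in range(n + 1)]
    return {x: max(m[x] for m in means) for x in f}

def l2_norm(g):
    """Unnormalized 2-norm of a function stored as a dict."""
    return sum(v * v for v in g.values()) ** 0.5

# Illustrative input (an assumption): a point mass at the origin of the 4-cube.
n = 4
origin = (0,) * n
f = {x: 1.0 if x == origin else 0.0 for x in product((0, 1), repeat=n)}
Mf = spherical_maximal(f, n)
ratio = l2_norm(Mf) / l2_norm(f)   # one sample of ||M_S f||_2 / ||f||_2
```

For a point mass, $M_{\mathcal{S}} f$ takes the value $1/\binom{n}{|x|}$ at $x$, so its 1-norm is $n+1$ while the squared 2-norm ratio is $\sum_k 1/\binom{n}{k}$, which stays bounded as $n$ grows.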

Maximal inequalities for various function spaces such as $L_p(\mathbb{R}^n)$, and with stochastic linear operators
defined by various distributions such as uniform on spheres (as above), uniform on balls, or according
to the distribution of a random walk of specified length (ergodic averaging), have been extensively
studied; see [9] for a good review. Most previous work does not explicitly consider finite or discrete
metric spaces; however, see [10] for a maximal inequality on the free group on finitely many generators.

One may ask whether the hypercube bound should follow from known results for larger spaces. The
hypercube of dimension greater than 1 does not embed isometrically in Euclidean space of any dimension [2, 8], so inequalities for Euclidean spaces do not seem to be a useful starting point. The hypercube
does embed isometrically in $\mathbb{R}^n$ with the $L_1$ norm, but there is no maximal inequality for this metric.
To see this, still in the context of discrete metric spaces, consider the space $\mathbb{Z}^2$ with the $L_1$ distance.
Fixing any nonnegative integer $N$, let $f$ be the indicator function of $\{x : \sum_i x_i = 0, \sum_i |x_i| \le 2N + 1\}$.
Then $\|f\|_2^2 = 2N + 1$ while $\|M_{\mathcal{S}} f\|_2^2 \in \Omega(N^2)$. A similar gap (between $O(N^{n-1})$ and $\Omega(N^n)$) occurs
in any fixed dimension $n$, because there exists a set of size $O(N^{n-1})$ constituting a positive fraction
of $\Omega(N^n)$ $L_1$-spheres, necessarily of many radii. It is therefore not the $L_1$ metric structure of the
hypercube which makes a maximal inequality possible, but, essentially, its bounded side-length.

1.1 Combinatorial interpretation 

In the special case that $f$ is the indicator function of a set of vertices $F$ in $I^n$, Theorem 1 has the
following consequence: For nonnegative $\epsilon$ less than some $\epsilon_0$ and for all $n$, if $|F| < \epsilon 2^n$ then there
exists $x \in I^n$ such that in every sphere about $x$, the fraction of points which lie in $F$ is $O(\sqrt{\epsilon})$.

The aspect of interest is that this holds for every sphere about x. The analogous claim for a fixed 
radius is a trivial application of the Markov inequality; by a union bound the same holds for any 
constant number of radii. Avoiding the union bound is the essence of the maximal inequality. 
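To illustrate the combinatorial statement, here is a small brute-force sketch (an illustration only; the helper names and the marked set are assumptions, not taken from the paper). For a marked set $F$ it finds a center $x$ minimizing the largest marked fraction over all spheres about $x$.

```python
from itertools import product
from math import comb

def worst_sphere_fraction(marked, n, x):
    """Largest fraction of marked vertices over all spheres centered at x."""
    counts = [0] * (n + 1)
    for y in marked:
        counts[sum(a != b for a, b in zip(x, y))] += 1
    return max(counts[k] / comb(n, k) for k in range(n + 1))

def best_center(marked, n):
    """Return (fraction, center) minimizing the worst sphere fraction."""
    return min((worst_sphere_fraction(marked, n, x), x)
               for x in product((0, 1), repeat=n))

# Hypothetical marked set: all vertices of the 6-cube with Hamming weight <= 1,
# that is, 7 of the 64 vertices.
n = 6
marked = [v for v in product((0, 1), repeat=n) if sum(v) <= 1]
frac, center = best_center(marked, n)
```

Spheres of radius 0 and $n$ contain a single vertex, so a center that is marked, or whose antipode is marked, always sees a sphere that is fully marked; the search has to avoid both.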

The combinatorial interpretation also has an edge version. Let $F'$ be a set of edges in $I^n$. Let the
distance from a point to an edge be the distance to the closer point on that edge. Theorem 1 has
the following consequence: For nonnegative $\epsilon$ less than some $\epsilon_1$ and for all $n$, if $|F'| < \epsilon n 2^n$ then
there exists $x \in I^n$ such that in every sphere about $x$, the fraction of edges which lie in $F'$ is $O(\sqrt{\epsilon})$.
(Define a function $f$ on vertices by $f(y)$ = the fraction of edges adjacent to $y$ that lie in $F'$. Note
that $\|f\|_2 \in O(\sqrt{\epsilon})$. Apply Theorem 1 to $f$. For the desired conclusion observe that in the sphere of
edges of distance $k$ from $x$, the fraction of edges lying in $F'$ is bounded for $k < n/2$ by $2\pi_x S_k f$, and
for $k \ge n/2$ by $2\pi_x S_{k+1} f$.)

Theorem 1 has the following consequence in the theory of computation: a significant open problem
is whether there is a polynomial-time algorithm for the UGC constraint satisfaction problem on the
hypercube. One proposed algorithm is a simple propagate-from-random-sources process. Absence
of a maximal inequality would have provided an adversary defeating this algorithm. (It should be
noted that propagating from a single random source is inadequate, as an $\epsilon$ fraction of vertices can be
removed to partition the hypercube into linearly-many components, all of less than constant fractional
size. However, for the purpose of the UGC, even a partition into $O(n)$ components does not rule out a
polynomial-time algorithm propagating from many random sources.)

1.2 Possible generalizations 

Let $G = (V, E)$ be any finite connected graph, with shortest-path metric $d_G$. Let $G^{\square n}$ be the $n$th
Cartesian power of $G$, the graph on $V^n$ in which $(v, w)$ is an edge if there is a unique $i$ for which
$(v_i, w_i) \in E$. The shortest path metric on $G^{\square n}$ is therefore the $L_1$ metric induced by $d_G$. Spherical
operators and spherical maximal operators are now defined, and we conjecture that Theorem 1 holds
for a suitable constant $A_G$.

In a different direction, the existence of a dimension-free bound for all $I^n$ raises the question whether
there is a natural limit object in which each $n$ occurs as a special case.






1.3 Proof overview 



Our proof is in two main steps, in each of which we obtain a maximal inequality for one class of
stochastic operators based on comparison with another more tractable class. To introduce the first
of these reductions we need to define the senate operator$^2$ Sen. Let $\mathcal{T} = \{T_k\}$ be any family of
stochastic operators indexed by a parameter $k$ which varies over an interval $[0, a]$ ($a$ possibly infinite)
of either nonnegative reals or nonnegative integers. (E.g., $\mathcal{S} = \{S_k\}_{k=0}^{n}$ as above.) Then the family
$\mathrm{Sen}(\mathcal{T}) = \{\mathrm{Sen}(\mathcal{T})_k\}$, indexed by $k$ in the same range, consists of the stochastic operators

$$\mathrm{Sen}(\mathcal{T})_k = \frac{1}{k+1} \sum_{\ell=0}^{k} T_\ell \quad \text{or} \quad \mathrm{Sen}(\mathcal{T})_k = \frac{1}{k} \int_0^k T_\ell \, d\ell$$

depending as $k$ ranges over integers or reals, and taking the limit from above at $k = 0$ in the continuous
case.
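In the integer-indexed case the senate operator is just a running average. A minimal sketch, applied to a sequence of scalar values standing in for $\pi_x T_k f$ (the sample values are hypothetical and the function name is an assumption):

```python
from fractions import Fraction

def senate(values):
    """Discrete senate averages: out[k] = (values[0] + ... + values[k]) / (k + 1)."""
    out, running = [], Fraction(0)
    for k, v in enumerate(values):
        running += Fraction(v)
        out.append(running / (k + 1))
    return out

# Hypothetical values of pi_x T_k f for k = 0..3 at some fixed point x.
sphere_values = [4, 0, 2, 6]
sen_values = senate(sphere_values)
```

Each senate value is an average of the original values, so for a fixed input the maximum over $k$ of the senate values never exceeds the maximum of the original sequence; the content of the first reduction is a bound in the opposite direction.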

In the first step of our argument we follow a comparison method due to Stein [12] to show that

Proposition 2. $\|M_{\mathcal{S}}\|_{2\to 2} \in O(1 + \|M_{\mathrm{Sen}(\mathcal{S})}\|_{2\to 2})$.

Bounds on the Krawtchouk polynomials play a key role in this argument. We will introduce the
polynomials and prove these bounds in Sec. 2 and then use them to prove Proposition 2 in Sec. 3.

To introduce the second reduction we need to define the family of stochastic noise operators $\mathcal{N} = \{N_t\}_{t \ge 0}$ indexed by real $t$. Letting $p = (1 - e^{-t})/2$, we set

$$N_t = \sum_{k=0}^{n} \binom{n}{k} p^k (1-p)^{n-k} S_k$$

This has the following interpretation. $\pi_x N_t f$ is the expectation of $\pi_y f$ where $y$ is obtained by running
$n$ independent Poisson processes with parameter 1 from time 0 to time $t$, and flipping the $i$th bit
of $x$ as many times as there are events in the $i$th Poisson process. The $N_t$'s form a semigroup:
$N_{t_1} N_{t_2} = N_{t_1 + t_2}$. The process is equivalent to a Poisson-clocked random walk on the hypercube.
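The semigroup property can be checked numerically on a small cube. The sketch below is an illustration under stated assumptions: it builds $N_t$ as an explicit matrix from the formula $p = (1 - e^{-t})/2$, using the fact that the $(x, y)$ entry of $\binom{n}{k} p^k (1-p)^{n-k} S_k$ is $p^k (1-p)^{n-k}$ when $d(x, y) = k$. The function names and the values of $n$, $t_1$, $t_2$ are arbitrary choices.

```python
from itertools import product
from math import exp

def noise_matrix(n, t):
    """Matrix of N_t: entry (x, y) is p^d (1-p)^(n-d) with d = d(x, y)."""
    p = (1 - exp(-t)) / 2
    cube = list(product((0, 1), repeat=n))

    def entry(x, y):
        d = sum(a != b for a, b in zip(x, y))
        return p ** d * (1 - p) ** (n - d)

    return [[entry(x, y) for y in cube] for x in cube]

def matmul(a, b):
    """Plain matrix product of two square lists-of-lists."""
    size = len(b)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def max_abs_diff(a, b):
    return max(abs(u - v) for ra, rb in zip(a, b) for u, v in zip(ra, rb))

n, t1, t2 = 3, 0.3, 0.8
gap = max_abs_diff(matmul(noise_matrix(n, t1), noise_matrix(n, t2)),
                   noise_matrix(n, t1 + t2))
row_sums = [sum(row) for row in noise_matrix(n, t1)]
```

The check works because composing two independent bit-flip channels with parameters $p_1, p_2$ gives flip probability $(1 - (1-2p_1)(1-2p_2))/2$, and $1 - 2p = e^{-t}$ is multiplicative in $t$.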

We show (in Sec. 4) by a direct pointwise comparison that:

Proposition 3. $\|M_{\mathrm{Sen}(\mathcal{S})}\|_{2\to 2} \in O(\|M_{\mathrm{Sen}(\mathcal{N})}\|_{2\to 2})$.

Finally (see [6]) $\|M_{\mathrm{Sen}(\mathcal{N})}\|_{2\to 2} \le 2\sqrt{2}$ (indeed, $\|M_{\mathrm{Sen}(\mathcal{N})}\|_{p\to p} \le (p/(p-1))^{1/p}$ for $p > 1$) by previous
results: the Hopf-Kakutani-Yosida maximal inequality and Marcinkiewicz's interpolation theorem.
For the reader's convenience, we restate these results in the Appendix.

Combining these results, we have Theorem 1 by $\|M_{\mathcal{S}} f\|_2 \in O(\|f\|_2 + \|M_{\mathrm{Sen}(\mathcal{S})} f\|_2) \subseteq O(\|f\|_2 + \|M_{\mathrm{Sen}(\mathcal{N})} f\|_2) \subseteq O(\|f\|_2)$.

Remark 4. While our main result is in terms of the $2 \to 2$ norm, many of our techniques generalize
to other norms. Here we are limited by Proposition 2, which does not conveniently generalize to other
norms, although we suspect this restriction is not fundamental. By contrast Proposition 3 holds for any
norm, even though we have stated it for simplicity with respect to the $2 \to 2$ norm. And the Dunford-Schwartz
maximal inequality yields a bound on the $1 \to 1,w$ norm, which by Marcinkiewicz interpolation yields
a bound on the $p \to p$ norm for any $p > 1$. Here, we define $\|f\|_{1,w} := \sup_{\lambda > 0} \lambda \|I[f > \lambda]\|_1$, where $I[\cdot]$
denotes an indicator function. It is possible that a dimension-independent maximal inequality holds
for spherical operators on the hypercube in the $1 \to 1,w$ norm. On the other hand, $\|M_{\mathcal{S}}\|_{1\to 1} = n + 1$,
as can be seen by taking $f$ to be nonzero only on a single point.

We are concerned in this paper solely with maximal operators for sets $\mathcal{A}$ of nonnegative matrices.
For any such maximal operator $M_{\mathcal{A}}$, $|\pi_x M_{\mathcal{A}} f| \le \pi_x M_{\mathcal{A}} |f|$. So it suffices to show Theorem 1 for
nonnegative $f$; this simplifies some expressions and will be assumed throughout.

2 The terminology is to express that (as in the United States Senate) each block has equal weight without regard to
size.

2 Fourier analysis and Krawtchouk polynomials



For $y \in I^n$, define the character $\chi_y \in V$ by $\pi_x \chi_y = (-1)^{x \cdot y} / \sqrt{2^n}$. The normalization is chosen so
that the $\chi_y$ form an orthonormal basis of $\mathbb{R}^{I^n}$. This basis simultaneously diagonalizes each $S_k$, as they
commute with $I^n$ as an abelian group. A direct calculation (see also [3]) shows that

$$S_k \chi_y = \kappa_k^{(n)}(|y|) \chi_y, \qquad (3)$$

where $|y| = |\{i : y_i = 1\}|$, and $\kappa_k^{(n)}(x)$ is the normalized $k$th Krawtchouk polynomial, defined by

$$\kappa_k^{(n)}(x) = \sum_{j=0}^{k} (-1)^j \frac{\binom{x}{j} \binom{n-x}{k-j}}{\binom{n}{k}} \qquad (4)$$



We collect here some facts about Krawtchouk polynomials.

Lemma 5.

1. $k$-$x$ symmetry: $\kappa_k^{(n)}(x) = \kappa_x^{(n)}(k)$.

2. Reflection symmetry: $\kappa_k^{(n)}(n - x) = (-1)^k \kappa_k^{(n)}(x)$.

3. Orthogonality:

$$\sum_{x=0}^{n} \kappa_k^{(n)}(x) \kappa_\ell^{(n)}(x) \binom{n}{x} = \frac{2^n}{\binom{n}{k}} \delta_{k,\ell} \qquad (5)$$

4. Roots: The roots of $\kappa_k^{(n)}(x)$ are real, distinct, and lie in the range $n/2 \pm \sqrt{k(n-k)}$.

The proofs of the first three claims are straightforward (see [7]), and the fourth claim is a weaker
version of Theorem 8 of [5]. We sometimes abbreviate $\kappa_k(x) = \kappa_k^{(n)}(x)$.
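The first three properties in Lemma 5 are finite identities, so they can be confirmed exactly for any fixed $n$. The following sketch evaluates definition (4) in exact rational arithmetic; the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def kappa(n, k, x):
    """Normalized Krawtchouk polynomial of degree k at the integer point x, per (4)."""
    total = sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))
    return Fraction(total, comb(n, k))

def check_lemma5(n):
    """Return exact truth values for symmetry, reflection, and orthogonality."""
    rng = range(n + 1)
    symmetry = all(kappa(n, k, x) == kappa(n, x, k) for k in rng for x in rng)
    reflection = all(kappa(n, k, n - x) == (-1) ** k * kappa(n, k, x)
                     for k in rng for x in rng)
    orthogonality = all(
        sum(kappa(n, k, x) * kappa(n, l, x) * comb(n, x) for x in rng)
        == (Fraction(2 ** n, comb(n, k)) if k == l else 0)
        for k in rng for l in rng)
    return symmetry, reflection, orthogonality

results = check_lemma5(7)
```

Working with `Fraction` avoids any floating-point tolerance, so the comparisons are exact equalities rather than approximate ones.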

Before going into further technical detail, we give an overview of our goals in this section. As we have
noted in Sec. 1, maximal inequalities are easily proved for semigroups, such as the noise operators $N_t$.
In some ways $S_k$ resembles $N_{k/n}$, since $N_t$ is approximately an average of $S_k$ for $k = nt \pm \sqrt{nt(1-t)}$.
While direct comparison is difficult (e.g. writing $S_k$ as a linear combination of $N_t$ necessarily entails
large coefficients), we can argue that the spectra of these operators should be qualitatively similar.

Indeed, the $N_t$ are also diagonal in the $\chi_y$ basis, and for $|y| = x$, their eigenvalue for $\chi_y$ is $(1 - 2t)^x$.
Thus, our goal in this section will be to show that $\kappa_k(x)$ has similar behavior$^3$ to $(1 - 2k/n)^x$. More
precisely, we prove

Lemma 6. There is a constant $c > 0$ such that for all $n$ and $0 \le x, k \le n/2$,

$$|\kappa_k^{(n)}(x)| \le e^{-ckx/n}. \qquad (6)$$

Due to Lemma 5.1 it suffices to bound $\kappa_k(x)$ only when $0 \le k \le x \le n/2$.

Proof. The main complication in working with Krawtchouk polynomials is that they have several 
different forms of asymptotic behavior depending on whether x and k are in the lower, middle or 
upper part of their range; indeed, [1] breaks the asymptotic properties of $\kappa_k(x)$ into 12 different cases.
However, for our purpose, we need only two different upper bounds on the Krawtchouk polynomials. 



3 This qualitative similarity breaks down when $k$ (or $x$) is close to $n/2$, although this case will not be the main
contribution to our bounds. See the introduction to for discussion of the properties of the $S_{n/2}$ operator.

Case I: $k \ge 0.14n$.

This is the simpler upper bound, which relies only on the orthogonality property Lemma 5.3. Setting
$k = \ell$ in Lemma 5.3 and observing that all of the terms on the LHS are nonnegative, it follows that

$$\left(\kappa_k^{(n)}(x)\right)^2 \le \frac{2^n}{\binom{n}{k} \binom{n}{x}}. \qquad (7)$$
Stirling's formula implies that (») > J^^Cl^^l (where H 2 (p) = -plgp - (1 - 
p)lg(l — p)); for sufficiently large n (say n > no) we can generously cover the error terms (for all 
1 < pn < n — 1) by (^) > 2 . (We use lg for the base-2 logarithm and log for base e.) 

Note that $H_2(0.12) > 1/2$, so, for $k, x \ge 0.14n$, we have $\kappa_k^2(x) \le \pi n \, 2^{(1 - 2H_2(0.14))n} \le \pi n \, 2^{(1 - 2H_2(0.12) - c_1)n} \le \pi n \, 2^{-c_1 n}$ for some $c_1 > 0$. Now let $n_1 \ge n_0$ be sufficiently large that $(\log 2) c_1 \ge 2 \log(\pi n_1)/n_1$;
then for all $n \ge n_1$, $\pi n \, 2^{-c_1 n} \le 2^{-c_1 n/2}$. So for all $n \ge n_1$ and all $0.14n \le k, x \le n/2$, $\kappa_k^2(x) \le 2^{-2c_1 (n/2)(n/2)/n} \le 2^{-2c_1 kx/n}$.

To handle the $n < n_1$ case, we define $c_2 = \min\{-(n/kx) \log |\kappa_k^{(n)}(x)| : 1 \le k \le x \le n/2, n < n_1\}$. It
is immediate from Definition (4) that $|\kappa_k^{(n)}(x)| < 1$ if $1 \le k, x \le n - 1$, so $c_2 > 0$.
Finally, the lemma follows with $c = \min\{2c_1, c_2\}$.

Case II: $k < 0.14n$.

It is convenient to make the change of variable

$$x = (1 - z)n/2,$$

set

$$\mu_k(z) = \kappa_k((1 - z)n/2),$$

and expand as $\mu_k(z) = \sum_{i=0}^{k} a_{k,i} z^i$. $\mu_k$ is either symmetric or anti-symmetric about 0, and we
focus on bounding $|\mu_k|$ in the range $0 \le z \le 1$ corresponding to $0 \le x \le n/2$.

Let $y_1, \ldots, y_k$ be the roots of $\mu_k$. By Lemma 5.2 the multiset $\{y_1, \ldots, y_k\}$ is identical to the multiset
$\{-y_1, \ldots, -y_k\}$. So we can write

$$\mu_k^2(z) = a_{k,k}^2 \prod_{i=1}^{k} (z^2 - y_i^2). \qquad (8)$$

It is immediate from Definition (4) that

$$\mu_k(1) = 1. \qquad (9)$$

Furthermore, by Lemma 5.4,

$$y_{\max} := \max_i |y_i| \le \frac{2}{n} \sqrt{k(n-k)}. \qquad (10)$$

We now obtain an upper bound on $\mu_k^2(z)$ simply by maximizing subject to the constraints (9) and
(10). Observe that

$$\mu_k^2(z) = \frac{\mu_k^2(z)}{\mu_k^2(1)} = \prod_{i=1}^{k} \frac{z^2 - y_i^2}{1 - y_i^2}.$$

Consider the problem of choosing $y_i$ to maximize a single term $|z^2 - y_i^2| / (1 - y_i^2)$. Observe that

$$\frac{d}{d y_i^2} \, \frac{z^2 - y_i^2}{1 - y_i^2} = \frac{z^2 - 1}{(1 - y_i^2)^2} < 0. \qquad (11)$$

As a result, the maximum over $|y_i| \le z$ is found at $y_i = 0$ and the maximum over $|y_i| \ge z$ is found at
$|y_i| = y_{\max}$. In the former case, $|z^2 - y_i^2| / (1 - y_i^2) = z^2$. In the latter case,

$$\frac{|z^2 - y_i^2|}{1 - y_i^2} \le \frac{y_{\max}^2 - z^2}{1 - y_{\max}^2} \le \frac{y_{\max}^2}{1 - y_{\max}^2} \le 0.93.$$

The last inequality uses the fact that $y_{\max} \le 2\sqrt{0.14 \cdot 0.86}$ (recalling that $k < 0.14n$). So $|z^2 - y_i^2| / (1 - y_i^2) \le \max(z^2, 0.93)$, implying that

$$\mu_k^2(z) \le (\max(z^2, 0.93))^k \qquad (12)$$

If $z^2 \ge 0.93$ then we use $z = 1 - 2x/n \le e^{-2x/n}$ to obtain $\kappa_k^2(x) \le e^{-4xk/n}$.

If $z^2 < 0.93$ then (recalling $x \le n/2$) we have $\kappa_k^2(x) \le (0.93)^k \le (0.93)^{2xk/n} = e^{-cxk/n}$ for $c = -2\ln(0.93)$. □
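For small $n$ the best constant in Lemma 6 can be found by direct search, in the spirit of the constant $c_2$ used in Case I. The sketch below is only a numerical illustration (the function names are assumptions): it computes the largest $c$ for which $|\kappa_k^{(n)}(x)| \le e^{-ckx/n}$ holds over $1 \le k, x \le n/2$ at a single fixed $n$, skipping the points where the polynomial vanishes.

```python
from math import comb, exp, log

def kappa(n, k, x):
    """Normalized Krawtchouk polynomial (4), evaluated in floating point."""
    total = sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))
    return total / comb(n, k)

def empirical_c(n):
    """Largest c with |kappa_k(x)| <= exp(-c*k*x/n) for all 1 <= k, x <= n/2."""
    best = float("inf")
    for k in range(1, n // 2 + 1):
        for x in range(1, n // 2 + 1):
            v = abs(kappa(n, k, x))
            if v > 0:                      # zeros satisfy the bound for every c
                best = min(best, -(n / (k * x)) * log(v))
    return best

constants = {n: empirical_c(n) for n in range(4, 21)}
```

A positive value at each tested $n$ is consistent with the lemma but says nothing about uniformity in $n$, which is what the proof above supplies.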

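3 Senates dominate dictatorships: proof of Proposition 2

Proposition 2 (restatement). $\|M_{\mathcal{S}}\|_{2\to 2} \in O(1 + \|M_{\mathrm{Sen}(\mathcal{S})}\|_{2\to 2})$.

We start by defining $\tilde{\mathcal{S}} = \{S_k\}_{k=0}^{\lfloor n/2 \rfloor}$. The operator $M_{\tilde{\mathcal{S}}} : V \to V$ is then defined by

$$\pi_x M_{\tilde{\mathcal{S}}} f = \max_{0 \le k \le n/2} \pi_x S_k f$$

Letting $\iota$ be the antipodal involution in $I^n$ (equivalently, we could use the spherical mean $S_n$),
$\pi_x M_{\mathcal{S}} f = \max\{\pi_x M_{\tilde{\mathcal{S}}} f, \pi_x \iota M_{\tilde{\mathcal{S}}} f\}$. So $\|M_{\mathcal{S}}\|_{2\to 2} \le \sqrt{2} \, \|M_{\tilde{\mathcal{S}}}\|_{2\to 2}$. Proposition 2 therefore follows
from:

Claim 7. There is a $C < \infty$ such that for all $n$ and $f$, $\|M_{\tilde{\mathcal{S}}} f\|_2 \le C\|f\|_2 + \|M_{\mathrm{Sen}(\mathcal{S})} f\|_2$.
We prove the claim in the following two subsections.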

3.1 A method of Stein 

The bounds on ||<S^||2->2 for even and odd radius I are technically distinct (but not in any interesting 
way). We present the arguments in parallel. 

3.1.1 Even radius: $S_{2r}$ for $0 \le r \le r_{\max} = \lfloor \lfloor n/2 \rfloor / 2 \rfloor$

Note that for $n = 4m + a$, $0 \le a \le 3$, this gives $r_{\max} = m$.
Abel's lemma gives the following easily verified identity:

$$S_{2r} - \frac{1}{r+1} \sum_{k=0}^{r} S_{2k} = \frac{1}{r+1} \sum_{k=1}^{r} k (S_{2k} - S_{2(k-1)}) \qquad (13)$$
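Identity (13) is linear in the $S_{2k}$, so it is enough to check it with arbitrary scalars in place of the operators. A minimal sketch with exact arithmetic (the random test sequence and the function name are illustrative assumptions):

```python
from fractions import Fraction
import random

def abel_gap(s, r):
    """Left side minus right side of (13), with scalars s[j] standing in for S_j."""
    lhs = s[2 * r] - sum((s[2 * k] for k in range(r + 1)), Fraction(0)) / (r + 1)
    rhs = sum((k * (s[2 * k] - s[2 * (k - 1)]) for k in range(1, r + 1)),
              Fraction(0)) / (r + 1)
    return lhs - rhs

random.seed(0)
s = [Fraction(random.randint(-9, 9)) for _ in range(21)]   # arbitrary stand-ins
gaps = [abel_gap(s, r) for r in range(11)]
```

Every entry of `gaps` is exactly zero, which is the summation-by-parts identity: the right side telescopes to $r S_{2r} - \sum_{k=0}^{r-1} S_{2k}$ before dividing by $r + 1$.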

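Hence we have the following pointwise (that is to say, valid at each point $x$) inequality for $0 \le r \le r_{\max}$,
$r_{\max} = \lfloor \lfloor n/2 \rfloor / 2 \rfloor$:

$$\left[ \pi_x \Big( S_{2r} - \frac{1}{r+1} \sum_{k=0}^{r} S_{2k} \Big) f \right]^2 = \left[ \frac{1}{r+1} \sum_{k=1}^{r} \sqrt{k} \cdot \sqrt{k} \, [\pi_x (S_{2k} - S_{2(k-1)}) f] \right]^2$$

$$\le \sum_{k=1}^{r} \frac{k}{(r+1)^2} \sum_{k=1}^{r} k \, [\pi_x (S_{2k} - S_{2(k-1)}) f]^2 \quad \text{by Cauchy-Schwarz}$$

$$= \frac{r}{2(r+1)} \sum_{k=1}^{r} k \, [\pi_x (S_{2k} - S_{2(k-1)}) f]^2$$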

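This suggests defining an "error term" operator $R_0 : V \to V$ by

$$\pi_x R_0 f = \sqrt{\frac{1}{2} \sum_{k=1}^{r_{\max}} k \, [\pi_x (S_{2k} - S_{2(k-1)}) f]^2}$$

so that for any $r \le r_{\max}$,

$$\Big\| \Big( S_{2r} - \frac{1}{r+1} \sum_{k=0}^{r} S_{2k} \Big) f \Big\|_2^2 \le \|R_0 f\|_2^2$$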



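3.1.2 Odd radius: $S_{2r+1}$ for $0 \le r \le r_{\max} = \lfloor (\lfloor n/2 \rfloor - 1)/2 \rfloor$

Note that for $n = 4m + a$, $0 \le a \le 3$, this gives $r_{\max} = m - 1$ if $a \in \{0, 1\}$ and $r_{\max} = m$ if $a \in \{2, 3\}$.
Abel's lemma gives:

$$S_{2r+1} - \frac{1}{r+1} \sum_{k=0}^{r} S_{2k+1} = \frac{1}{r+1} \sum_{k=1}^{r} k (S_{2k+1} - S_{2k-1})$$

Hence we have the following pointwise inequality for $0 \le r \le r_{\max} = \lfloor (\lfloor n/2 \rfloor - 1)/2 \rfloor$:

$$\left[ \pi_x \Big( S_{2r+1} - \frac{1}{r+1} \sum_{k=0}^{r} S_{2k+1} \Big) f \right]^2 = \left[ \frac{1}{r+1} \sum_{k=1}^{r} \sqrt{k} \cdot \sqrt{k} \, [\pi_x (S_{2k+1} - S_{2k-1}) f] \right]^2$$

$$\le \cdots \le \frac{1}{2} \sum_{k=1}^{r_{\max}} k \, [\pi_x (S_{2k+1} - S_{2k-1}) f]^2$$

This suggests defining an "error term" operator $R_1 : V \to V$ by

$$\pi_x R_1 f = \sqrt{\frac{1}{2} \sum_{k=1}^{r_{\max}} k \, [\pi_x (S_{2k+1} - S_{2k-1}) f]^2}$$

so that for any $r \le r_{\max}$,

$$\Big\| \Big( S_{2r+1} - \frac{1}{r+1} \sum_{k=0}^{r} S_{2k+1} \Big) f \Big\|_2^2 \le \|R_1 f\|_2^2$$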



3.2 Bounding the error term 

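Claim 7 (and Proposition 2) now follow from:

Lemma 8. There is a $C < \infty$ such that $\|R_0 f\|_2, \|R_1 f\|_2 \le C\|f\|_2$.

Proof. As seen in the preliminaries, the operators $S_k$ commute and share the eigenvectors $\{\chi_y\}_{y \in I^n}$
with eigenvalues given by Eqn (3): $S_k \chi_y = \kappa_k^{(n)}(|y|) \chi_y$. Let $E_x$ be the projection operator on the
$\binom{n}{x}$-dimensional eigenspace spanned by $\{\chi_y\}_{|y| = x}$, so $S_k = \sum_{x=0}^{n} \kappa_k^{(n)}(x) E_x$. We calculate:

$$\|R_0 f\|_2^2 = \sum_{z \in I^n} \frac{1}{2} \sum_{k=1}^{r_{\max}} k \, [((S_{2k} - S_{2(k-1)}) f)(z)]^2$$

$$= \frac{1}{2} \sum_{k=1}^{r_{\max}} k \sum_{x=0}^{n} \left( \kappa_{2k}^{(n)}(x) - \kappa_{2(k-1)}^{(n)}(x) \right)^2 \|E_x f\|_2^2$$

$$= \frac{1}{2} \sum_{x=0}^{n} \|E_x f\|_2^2 \sum_{k=1}^{r_{\max}} k \left( \kappa_{2k}^{(n)}(x) - \kappa_{2(k-1)}^{(n)}(x) \right)^2$$

Likewise: (here and below the value of $r_{\max}$ depends on whether $R_0$ or $R_1$ is being bounded)

$$\|R_1 f\|_2^2 = \sum_{z \in I^n} \frac{1}{2} \sum_{k=1}^{r_{\max}} k \, [((S_{2k+1} - S_{2k-1}) f)(z)]^2$$

$$= \frac{1}{2} \sum_{x=0}^{n} \|E_x f\|_2^2 \sum_{k=1}^{r_{\max}} k \left( \kappa_{2k+1}^{(n)}(x) - \kappa_{2k-1}^{(n)}(x) \right)^2$$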

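Since $\|f\|_2^2 = \sum_{x=0}^{n} \|E_x f\|_2^2$, it suffices to show that there is a $C < \infty$ such that for every $0 \le x \le n$,

$$\max \left\{ \sum_{k=1}^{r_{\max}} k \left( \kappa_{2k}(x) - \kappa_{2(k-1)}(x) \right)^2, \; \sum_{k=1}^{r_{\max}} k \left( \kappa_{2k+1}(x) - \kappa_{2k-1}(x) \right)^2 \right\} \le C. \qquad (19)$$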



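Recall that it suffices by Lemma 5.2 to consider $x \le n/2$. For $x = 0$, (19) is trivial as the LHS is 0.
For $x > 0$ we use Lemma 5.1 to rewrite the parenthesized term (with $\ell = 2k$ or $\ell = 2k + 1$) as follows:

$$\kappa_x^{(n)}(\ell) - \kappa_x^{(n)}(\ell - 1) = -2 \frac{\binom{n-1}{x-1}}{\binom{n}{x}} \kappa_{x-1}^{(n-1)}(\ell - 1) = -\frac{2x}{n} \kappa_{x-1}^{(n-1)}(\ell - 1)$$

To see this, recall that $\binom{n}{x} \kappa_x^{(n)}(\ell)$ counts $x$-subsets of $\{1, \ldots, n\}$ according to the parity of their
intersection with $\{1, \ldots, \ell\}$; now condition on whether the $x$-subset contains the element $\ell$.

Consequently,

$$\kappa_x^{(n)}(\ell) - \kappa_x^{(n)}(\ell - 2) = -\frac{2x}{n} \left( \kappa_{x-1}^{(n-1)}(\ell - 1) + \kappa_{x-1}^{(n-1)}(\ell - 2) \right)$$

which by a similar argument is

$$= -\frac{4x(n-x)}{n(n-1)} \kappa_{x-1}^{(n-2)}(\ell - 2).$$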
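Both difference identities are finite statements for each $n$ and can be confirmed exactly. The sketch below is an illustration only; `kappa(n, k, x)` evaluates definition (4) at integer points and the function names are assumptions.

```python
from fractions import Fraction
from math import comb

def kappa(n, k, x):
    """Normalized Krawtchouk polynomial of degree k at the integer point x."""
    total = sum((-1) ** j * comb(x, j) * comb(n - x, k - j) for j in range(k + 1))
    return Fraction(total, comb(n, k))

def check_differences(n):
    """Exact check of the one-step and two-step difference identities."""
    one_step = all(
        kappa(n, x, l) - kappa(n, x, l - 1)
        == -Fraction(2 * x, n) * kappa(n - 1, x - 1, l - 1)
        for x in range(1, n + 1) for l in range(1, n + 1))
    two_step = all(
        kappa(n, x, l) - kappa(n, x, l - 2)
        == -Fraction(4 * x * (n - x), n * (n - 1)) * kappa(n - 2, x - 1, l - 2)
        for x in range(1, n) for l in range(2, n + 1))   # x = n excluded: factor n - x is 0
    return one_step, two_step

results = check_differences(9)
```

The case $x = n$ is left out of the two-step check only because $\kappa_{n-1}^{(n-2)}$ is not defined; the prefactor $n - x$ vanishes there anyway.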

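The two terms on the LHS of equation (19) are now

$$\sum_{k=1}^{r_{\max}} k \left( \kappa_x^{(n)}(2k) - \kappa_x^{(n)}(2k - 2) \right)^2 = \sum_{k=1}^{r_{\max}} k \left( \frac{4x(n-x)}{n(n-1)} \kappa_{x-1}^{(n-2)}(2k - 2) \right)^2 \qquad (21)$$

$$\sum_{k=1}^{r_{\max}} k \left( \kappa_x^{(n)}(2k + 1) - \kappa_x^{(n)}(2k - 1) \right)^2 = \sum_{k=1}^{r_{\max}} k \left( \frac{4x(n-x)}{n(n-1)} \kappa_{x-1}^{(n-2)}(2k - 1) \right)^2 \qquad (22)$$

For $x = 1$, quantities (21, 22) come to $\sum_{k=1}^{r_{\max}} 16k/n^2$, which is upper bounded by a constant.
For $x > 1$ we upper bound (21, 22), first, by

$$\frac{16x^2}{n^2} \sum_{k=1}^{r_{\max}} k \left( \kappa_{x-1}^{(n-2)}(2k - 2) \right)^2 \quad \text{and} \quad \frac{16x^2}{n^2} \sum_{k=1}^{r_{\max}} k \left( \kappa_{x-1}^{(n-2)}(2k - 1) \right)^2$$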

which in turn are upper bounded by (applying each value of r max ): 

^|>/2+i)(e?w- 

fe=0 

Now apply Lemma 6 to upper bound this by

i^y(fc/2 + l)e- 2c( ^ 1)fe/( "- 2) 
fe=o 

1fi r 2 °L 

' J2( k / 2 + 1 ) e ~ ak defining a = 2c(x-l)/(n- 2) 



i 2 
,.2 



n 



X 16(10+8^ uging th(; . dent . ty y, = ^ _ _ 2 



< 24 

< 24 



ll-< 

, 2 



n(l - e~ tt ) 

X N 2 



2cn(x — 1) 
< 24/c 2 

completing the requirement of Eqn. (19). □

4 Comparing senate maximal functions: proof of Proposition 3

Proposition 3 (restatement). $\|M_{\mathrm{Sen}(\mathcal{S})}\|_{2\to 2} \in O(\|M_{\mathrm{Sen}(\mathcal{N})}\|_{2\to 2})$.
Proof. The proof relies on pointwise comparison of maximal functions. 

If $A, B$ are matrices, write $A \le B$ if $B - A$ is a nonnegative matrix. If $\mathcal{A}, \mathcal{B}$ are sets of nonnegative
matrices indexed by integers or $\mathbb{R}$, we write $\mathcal{A} \le \mathcal{B}$ if for every $A \in \mathcal{A}$ there is a probability measure
$\mu_A$ on $\mathcal{B}$ such that $A \le \int B \, d\mu_A(B)$. Observe that in this case for any nonnegative function $f$ and
any $x$, $\sup_{A \in \mathcal{A}} \pi_x A f \le \sup_{B \in \mathcal{B}} \pi_x B f$, and therefore for any norm, $\|M_{\mathcal{A}}\| \le \|M_{\mathcal{B}}\|$ (and in particular
for $\| \cdot \|_{2\to 2}$).

For any $k > \lfloor n/2 \rfloor$,

$$\pi_x \mathrm{Sen}(\mathcal{S})_k f \le \pi_x \mathrm{Sen}(\mathcal{S})_{\lfloor n/2 \rfloor} (f + \iota f)$$

Therefore $\|M_{\mathrm{Sen}(\mathcal{S})}\|_{2\to 2} \le 2 \|M_{\mathrm{Sen}(\tilde{\mathcal{S}})}\|_{2\to 2}$. However, we will not compare $\mathrm{Sen}(\mathcal{S})$ and $\mathrm{Sen}(\mathcal{N})$ directly.
Instead, we will introduce a variant of $\mathcal{N}$ that more closely resembles $\mathcal{S}$, but is no longer a semigroup.

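Recall that $N_t$ represents the average over independently flipping each bit with probability $p = (1 - e^{-t})/2$. Define $\tilde{N}_p$ to represent the same noise process but parameterized by $p$ instead of $t$. Thus
$N_t = \tilde{N}_{(1 - e^{-t})/2}$ and $\tilde{N}_p = N_{-\ln(1 - 2p)}$. While the sets $\{N_t\}_{t \ge 0}$ and $\{\tilde{N}_p\}_{p \in [0, 1/2)}$ are of course the same,
their Senate operators $\mathrm{Sen}(\mathcal{N})$ and $\mathrm{Sen}(\tilde{\mathcal{N}})$ are different:

$$\mathrm{Sen}(\tilde{\mathcal{N}})_P = \frac{1}{P} \int_0^P \tilde{N}_p \, dp \quad \text{for } P \in (0, 1/2) \qquad (23)$$

$$\mathrm{Sen}(\mathcal{N})_T = \frac{1}{T} \int_0^T N_t \, dt = \frac{1}{T} \int_0^{(1 - e^{-T})/2} \frac{2}{1 - 2p} \tilde{N}_p \, dp \quad \text{for } 0 < T \qquad (24)$$

Hence Proposition 3 is established in two subsidiary claims:

Lemma 9. $\mathrm{Sen}(\tilde{\mathcal{N}}) \le \mathrm{Sen}(\mathcal{N})$.

Lemma 10. $\mathrm{Sen}(\tilde{\mathcal{S}}) \le C \cdot \mathrm{Sen}(\tilde{\mathcal{N}})$ for some constant $C > 0$.

□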

Proof of Lemma 9. We will identify, for each $P$, a distribution $\mu_P$ such that

$$\mathrm{Sen}(\tilde{\mathcal{N}})_P \le \int \mu_P(T) \, \mathrm{Sen}(\mathcal{N})_T \, dT \qquad (25)$$

In fact, we will achieve equality in (25) by a careful choice of $\mu_P$.

To compare $\mathrm{Sen}(\tilde{\mathcal{N}})$ with $\mathrm{Sen}(\mathcal{N})$, we think of both as convex combinations of various $\tilde{N}_p$ operators.
However, $\mathrm{Sen}(\tilde{\mathcal{N}})_P$ is a "flat" distribution in which $\tilde{N}_p$ appears with weight $1/P$ (for $0 \le p \le P$),
and $\mathrm{Sen}(\mathcal{N})_T$ is biased towards larger values of $p$ in that $\tilde{N}_p$ appears with weight proportional to
$1/(1 - 2p)$. We will use $\mu_P$ to compensate for this bias by writing $\mathrm{Sen}(\tilde{\mathcal{N}})_P$ as a range of different
choices of $\mathrm{Sen}(\mathcal{N})_T$. Formally, this means we need to choose $\mu_P$ so that for all $p \ge 0$ it solves the
following integral equation:

$$\frac{I[p \le P]}{P} = \frac{2}{1 - 2p} \int_{-\ln(1 - 2p)}^{\infty} \frac{1}{T} \mu_P(T) \, dT \qquad (26)$$

where the indicator function $I[E]$ is defined to be 1 when the event $E$ holds and 0 otherwise.

We are able to solve (26) using (informally speaking) a continuous version of a greedy algorithm. First
we place as much weight as we need on $T = -\ln(1 - 2P)$ to cover $\tilde{N}_P$. Then we continue to $p < P$,
placing only enough weight on $\mathrm{Sen}(\mathcal{N})_{-\ln(1 - 2p)}$ to bring the total weight of $\tilde{N}_p$ up to $1/P$.

The solution of (26) will involve putting a possibly nonzero amount of mass at $T = -\ln(1 - 2P)$ and
spreading the rest out over $0 < T < -\ln(1 - 2P)$. More precisely, define

$$\mu_P(T) := \frac{T e^{-T}}{2P} I(T < -\ln(1 - 2P)) + \frac{1 - 2P}{2P} \cdot (-\ln(1 - 2P)) \, \delta_{-\ln(1 - 2P)}(T). \qquad (27)$$

It is a straightforward calculation to verify that $\mu_P(T)$ satisfies (26) and that $\int_0^\infty dT \, \mu_P(T) = 1$ for
all $P \in (0, 1/2)$. This implies that $\mathrm{Sen}(\tilde{\mathcal{N}}) \le \mathrm{Sen}(\mathcal{N})$ and thus that $\|M_{\mathrm{Sen}(\tilde{\mathcal{N}})}\|_{2\to 2} \le \|M_{\mathrm{Sen}(\mathcal{N})}\|_{2\to 2}$,
completing the proof of Lemma 9. □
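The "straightforward calculation" can also be checked numerically. The sketch below is a sanity check under stated assumptions (a plain midpoint rule, arbitrary test values of $p$ and $P$, illustrative function names): it integrates the continuous part of $\mu_P$ from (27), adds the point mass, and compares both the total mass and the right-hand side of (26) against their claimed values.

```python
from math import exp, log

def t_of(p):
    """Time parameter T = -ln(1 - 2p) corresponding to flip probability p."""
    return -log(1 - 2 * p)

def atom_mass(P):
    """Point mass of mu_P located at T = t_of(P), from (27)."""
    return (1 - 2 * P) / (2 * P) * t_of(P)

def density(T, P):
    """Continuous part of mu_P on 0 < T < t_of(P), from (27)."""
    return T * exp(-T) / (2 * P)

def midpoint(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))

def total_mass(P):
    return midpoint(lambda T: density(T, P), 0.0, t_of(P)) + atom_mass(P)

def rhs_26(p, P):
    """Right-hand side of (26); the left-hand side is 1/P for p <= P, else 0."""
    if p > P:
        return 0.0                       # mu_P has no mass beyond t_of(P)
    cont = midpoint(lambda T: density(T, P) / T, t_of(p), t_of(P))
    return 2 / (1 - 2 * p) * (cont + atom_mass(P) / t_of(P))

P = 0.3
mass = total_mass(P)
value = rhs_26(0.1, P)
```

Analytically the continuous part contributes $(e^{-t_p} - e^{-t_P})/(2P)$ with $e^{-t_p} = 1 - 2p$, and the atom contributes $(1 - 2P)/(2P)$, so the bracket equals $(1 - 2p)/(2P)$ and the prefactor cancels to leave $1/P$.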

We remark that the intuition of a "greedy algorithm" described above could be made rigorous, and 
would yield an alternate proof, by replacing the continuous range of p that we consider with a discrete 
set $\{P/L, 2P/L, \ldots, P\}$, for some large integer $L$. Since $\tilde{N}_p$ is a continuous function of $p$, the error
of this approximation can be shown to go to zero as $L \to \infty$. As a result, the greedy algorithm is
well-defined and finite. 

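Proof of Lemma 10. To compare $\mathrm{Sen}(\tilde{\mathcal{S}})$ and $\mathrm{Sen}(\tilde{\mathcal{N}})$, we need to show that for any $K \le n/2$, we can
find a distribution over $P$ such that $\mathrm{Sen}(\tilde{\mathcal{S}})_K$ is pointwise $\le$ the appropriate average over $\mathrm{Sen}(\tilde{\mathcal{N}})_P$
times a constant. In fact, it will suffice to consider a distribution that is concentrated on a single value
of $P$. Define $P_K := \min\left( \frac{K + \sqrt{K}}{n}, \frac{1}{2} \right)$. In Lemma 11 we will show that $\mathrm{Sen}(\mathcal{S})_K \le C \cdot \mathrm{Sen}(\tilde{\mathcal{N}})_{P_K}$, thus
implying that $\mathrm{Sen}(\tilde{\mathcal{S}}) \le C \cdot \mathrm{Sen}(\tilde{\mathcal{N}})$. The idea behind Lemma 11 is that for each $k \le K$, there are
significant contributions to the $S_k$ coefficient of $\mathrm{Sen}(\tilde{\mathcal{N}})_{P_K}$ for $p$ throughout the range $[k/n, (k + \sqrt{k})/n]$.
This window has width $\sqrt{k}/n$, contributes $\Omega(1/\sqrt{k})$ weight to $S_k$ at each point and is normalized by
$1/P_K$, which is $\Omega(n/K)$. The total contribution is thus $\Omega(1/K)$.

Lemma 11. Let $n \ge 9$, $K \le n/2$ and $P_K = \min\left( \frac{K + \sqrt{K}}{n}, \frac{1}{2} \right)$. Then $\mathrm{Sen}(\mathcal{S})_K \le 3e^{20} \cdot \mathrm{Sen}(\tilde{\mathcal{N}})_{P_K}$.
The true constant is certainly much better, and perhaps closer to 1/2.

Proof. Observe that $P := P_K \le 2K/n$.

For $0 \le k \le K$, we now compare the coefficient of $S_k$ in $\mathrm{Sen}(\mathcal{S})_K$ (where it is $1/(K + 1)$) to its value
in $\mathrm{Sen}(\tilde{\mathcal{N}})_P$, where it is $\frac{1}{P} \int_0^P dp \, B(n, p, k)$, with $B(n, p, k) = \binom{n}{k} p^k (1-p)^{n-k}$ the coefficient of $S_k$ in $\tilde{N}_p$. Denote this latter quantity by $a_k$.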

Consider first $k = 0$. If $K = 0$ and $P = 0$, then $S_0$ has weight 1 in both cases. Otherwise $P \ge 1/n$,
and

$$a_0 \ge \frac{1}{P} \int_0^{1/n} dp \, B(n, p, 0) \ge \frac{n}{2K} \cdot \frac{1}{n} \left( 1 - \frac{1}{n} \right)^n \ge \frac{1}{8K}$$

This last inequality uses the fact that $(1 - 1/n)^n$ is increasing with $n$ and thus is $\ge 1/4$ for $n \ge 2$.

Next, suppose $k + \sqrt{k} \le n/2$. Then $P \ge (k + \sqrt{k})/n$. We will consider only the contribution to $a_k$ resulting
from $0 \le p - k/n \le \sqrt{k}/n$. First, use Stirling's formula to bound



> e~xp(nH2(k/n)). —— — - > exp(nH2(k/n)) 



implying that 



Next, observe that 



kj ~ ^ ^ 1 "\j9k{n-k) v ' V l "3Vk' 



Ti i 
When $k/n \le q \le p \le (k + \sqrt{k})/n$, we have $k - nq \ge k - np \ge -\sqrt{k}$ and $q(1 - q) \ge \frac{k}{n} \left( 1 - \frac{k}{n} \right) \ge \frac{k}{2n}$.
Thus, $0 \le p - k/n \le \sqrt{k}/n$ implies that

$$\ln \frac{B(n, p, k)}{B(n, \frac{k}{n}, k)} = \int_{k/n}^{p} dq \, \frac{k - nq}{q(1 - q)} \qquad (30a)$$

$$\ge -\frac{\sqrt{k}}{n} \cdot \sqrt{k} \cdot \frac{2n}{k} \qquad (30b)$$

$$\ge -2 \qquad (30c)$$

Combining (28) and (30), we find that $B(n, p, k) \ge 1/(6e^2 \sqrt{k})$ for $k/n \le p \le (k + \sqrt{k})/n$. Thus

$$a_k \ge \frac{1}{P} \int_{k/n}^{(k + \sqrt{k})/n} dp \, B(n, p, k) \ge \frac{n}{2K} \cdot \frac{\sqrt{k}}{n} \cdot \frac{1}{6e^2 \sqrt{k}} \ge \frac{1}{12 e^2 K}$$

Finally, we consider the case when $k + \sqrt{k} > n/2$. In this regime, we have $P = 1/2$. Also, $k \le n/2$,
so $K \ge k \ge n/2 - \sqrt{n/2} \ge n/4$ (assuming in the last step that $n \ge 8$).

Consider $p \in [1/2 - 1/\sqrt{n}, 1/2]$. Assume that $n \ge 9$ so that $1/2 - 1/\sqrt{n} \ge 1/6$. We can then use (29)
to bound



Similarly 

i?(n,i,fc) > / 1 _ fc\ fc-n/2 
n i?(n,^,fc)-U 1/12 " 



We conclude that 



1 f! Ill 

afe>n/ dpB(n,p,k) >2--=- ——= > 



2 \/n 

□ 

This completes the proof of Lemma 10. □
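Because the $S_k$ have disjoint supports as matrices, the comparison in Lemma 11 reduces to comparing coefficients: $1/(K+1)$ against $a_k = \frac{1}{P} \int_0^P B(n, p, k) \, dp$ for each $k \le K$. The sketch below estimates the smallest workable constant for one small $n$; it is a numerical illustration only, with assumed function names, a simple midpoint rule, and $n = 16$ chosen arbitrarily.

```python
from math import comb, sqrt

def binom_pmf(n, p, k):
    """B(n, p, k): the coefficient of S_k in the noise operator with flip probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def senate_noise_coeff(n, P, k, steps=4000):
    """a_k = (1/P) * integral_0^P B(n, p, k) dp, by the midpoint rule."""
    h = P / steps
    return sum(binom_pmf(n, (i + 0.5) * h, k) for i in range(steps)) * h / P

def empirical_constant(n, K):
    """Smallest C with 1/(K+1) <= C * a_k for every 0 <= k <= K."""
    P = min((K + sqrt(K)) / n, 0.5)
    return max((1 / (K + 1)) / senate_noise_coeff(n, P, k) for k in range(K + 1))

constants = {K: empirical_constant(16, K) for K in range(1, 9)}
```

In this small experiment the constants stay in the low single digits, in line with the remark that the true constant is far better than the one stated in the lemma.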



A Maximal inequalities on semigroups 

In this appendix, we review the maximal ergodic inequality that we use to show $\|M_{\mathrm{Sen}(\mathcal{N})}\|_{1\to 1,w} \le 1$.

Theorem 12 (maximal ergodic inequality). Let $A$ be a positive sublinear operator with $\|A\|_{1\to 1} \le 1$
and $\|A\|_{\infty\to\infty} \le 1$. Let $\mathcal{A}$ denote the (discrete) semigroup generated by $A$. Then

$$\|M_{\mathrm{Sen}(\mathcal{A})}\|_{1\to 1,w} \le 1.$$

This inequality was originally proved by Kakutani and Yosida (in the case when $A$ is linear) and Hopf
(for general semigroups). However since the proof of [1] is simple and self-contained, we restate it
here.

Proof. For $f$ an arbitrary function and $T \ge 0$, define $E_f = I[M_{\mathrm{Sen}(\mathcal{A})_{\le T}} f > 0]$. We will first prove
that

$$\langle E_f, f \rangle \ge 0, \qquad (31)$$

for all $T$. To see that the theorem follows from this claim, apply (31) to $f - \lambda$ and we find that
$\lambda \|E_{f-\lambda}\|_1 \le \langle E_{f-\lambda}, f \rangle \le \|f\|_1$. Thus $\|E_{f-\lambda}\|_1 \le \lambda^{-1} \|f\|_1$. Since this inequality holds for all $T$, it
implies that $\|I[M_{\mathrm{Sen}(\mathcal{A})} f > \lambda]\|_1$ is also $\le \lambda^{-1} \|f\|_1$.

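We now return to the proof of (31). We define for this purpose an unweighted Senate operator:

$$\widetilde{\mathrm{Sen}}(\mathcal{A})_T := \sum_{t=0}^{T} A^t$$

We abbreviate $M_T := M_{\mathrm{Sen}(\mathcal{A})_{\le T}}$ and $\tilde{M}_T := M_{\widetilde{\mathrm{Sen}}(\mathcal{A})_{\le T}}$.

Observe that $I[M_T f > 0] = I[\tilde{M}_T f > 0]$.
Also define, for any function $g$, the function $(g)^+$ to be the nonnegative part of $g$.

Thus, if $0 \le t \le T$, then we have $\widetilde{\mathrm{Sen}}(\mathcal{A})_t f \le \tilde{M}_T f \le (\tilde{M}_T f)^+$. This implies that

$$f + A(\tilde{M}_T f)^+ \ge f + A \, \widetilde{\mathrm{Sen}}(\mathcal{A})_t f = \widetilde{\mathrm{Sen}}(\mathcal{A})_{t+1} f.$$

Thus, $f \ge \widetilde{\mathrm{Sen}}(\mathcal{A})_t f - A((\tilde{M}_T f)^+)$ (including a $t = 0$ case that can be checked separately) and
maximizing over $0 \le t \le T$, we have

$$f \ge \tilde{M}_T f - A((\tilde{M}_T f)^+).$$

Now we take the inner product of both sides with $E_f$ and find

$$\langle E_f, f \rangle \ge \langle E_f, \tilde{M}_T f - A((\tilde{M}_T f)^+) \rangle \qquad (32)$$

$$= \langle E_f, (\tilde{M}_T f)^+ - A((\tilde{M}_T f)^+) \rangle \qquad (33)$$

$$= \|(\tilde{M}_T f)^+\|_1 - \langle E_f, A((\tilde{M}_T f)^+) \rangle \qquad (34)$$

$$\ge \|(\tilde{M}_T f)^+\|_1 - \|A((\tilde{M}_T f)^+)\|_1 \qquad (35)$$

$$\ge 0 \quad \text{because } \|A\|_{1\to 1} \le 1 \qquad (36)$$

□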



For our purposes, we will want to convert the bound on the $1 \to 1,w$ norm into a bound on $p \to p$
norms, especially for $p = 2$. This is achieved by the Marcinkiewicz interpolation theorem [13].

Theorem 13. Let $A$ be a sublinear operator with $\|A\|_{p\to p,w} \le N_p$ and $\|A\|_{q\to q,w} \le N_q$. Then for any
$p < r < q$, we have

$$\|A\|_{r\to r} \le 2 \left( \frac{r(q - p)}{(r - p)(q - r)} \right)^{1/r} N_p^{\delta} N_q^{1 - \delta}, \qquad \text{where } \delta = \frac{1/r - 1/q}{1/p - 1/q}.$$

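In our case, a maximal operator $M_{\mathcal{A}}$ has $\|M_{\mathcal{A}}\|_{\infty\to\infty,w} = 1$ and if $\mathcal{A}$ is a positive contractive semigroup,
then Theorem 12 implies that $\|M_{\mathcal{A}}\|_{1\to 1,w} \le 1$ as well. This implies that for any $1 < p$, we have

$$\|M_{\mathcal{A}}\|_{p\to p} \le 2 \left( \frac{p}{p - 1} \right)^{1/p} \qquad (37)$$

which is $2\sqrt{2}$ when $p = 2$.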
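As a quick arithmetic check of (37), the following sketch evaluates the bound for a few exponents; the function name is an illustrative assumption.

```python
from math import sqrt

def strong_type_bound(p):
    """Right-hand side of (37): 2 * (p / (p - 1)) ** (1 / p), for p > 1."""
    if p <= 1:
        raise ValueError("the bound (37) is stated only for p > 1")
    return 2 * (p / (p - 1)) ** (1 / p)

bound_at_2 = strong_type_bound(2)          # the case used for the 2 -> 2 norm
bounds = {p: strong_type_bound(p) for p in (1.1, 1.5, 2, 4, 10)}
```

The bound grows without limit as $p$ decreases to 1 and tends to 2 as $p \to \infty$, in keeping with the fact that only a weak-type estimate is available at $p = 1$.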